Sheffield | 25-SDC-Nov | Sheida Shabankari | Sprint 2 |Improve code with precomputing#113
Conversation
cjyuan left a comment
Can you use complexity to explain why your implementation is an improvement?
# Precompute prefix hashes for each string to speed up comparisons
prefix_map = {s: [hash(s[:i + 1]) for i in range(len(s))] for s in strings}
How do these "prefix hashes" speed up comparison?
At first, I thought the exercise needed me to precompute and keep some data for each string. So I used prefix hashes, thinking this would avoid comparing letters one by one and make the function much faster.
But after testing, I saw that using hashes did not make it much faster. Sorting the strings and comparing only neighbors worked better, was simpler, and gave the speed improvement needed.
Note: With the way you used prefix hashes, the complexity actually became higher (when the length of each string is large).
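To illustrate the note above: building each prefix `s[:i + 1]` copies (and then hashes) i + 1 characters, so a single string of length m costs roughly m(m + 1)/2 character operations, which is O(m²), whereas comparing two strings character by character is O(m). The sketch below only counts that slicing work; the helper name `prefix_hash_work` is hypothetical and not part of the PR.

```python
def prefix_hash_work(s: str) -> int:
    """Count the characters copied when slicing every prefix s[:i + 1] of s.

    Hypothetical helper for illustration only; it mirrors the slicing done by
    the prefix_map comprehension, without the hashing step.
    """
    return sum(len(s[:i + 1]) for i in range(len(s)))


if __name__ == "__main__":
    # The work grows quadratically with the string length m.
    for m in (10, 100, 1000):
        print(m, prefix_hash_work("a" * m))
```

For m = 10, 100 and 1000 the counts are 55, 5050 and 500500, matching m(m + 1)/2.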
That makes sense. In my case, prefix hashing introduced additional preprocessing and memory costs without improving the asymptotic complexity. Once I removed it and only compared adjacent strings after sorting, the solution became simpler and faster, especially for longer inputs.
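The "sort, then compare only adjacent strings" approach described above could look roughly like this. It is a minimal sketch, assuming `find_longest_common_prefix` should return the longest prefix shared by any two strings in the list and an empty string when there is none; the helper `common_prefix` is a hypothetical name, and the code actually merged in the PR may differ.

```python
from typing import List


def common_prefix(a: str, b: str) -> str:
    """Return the longest prefix shared by a and b (hypothetical helper)."""
    limit = min(len(a), len(b))
    i = 0
    while i < limit and a[i] == b[i]:
        i += 1
    return a[:i]


def find_longest_common_prefix(strings: List[str]) -> str:
    """Assumed behaviour: longest prefix common to any two strings in the list."""
    longest = ""
    ordered = sorted(strings)
    # After sorting, the pair sharing the longest prefix is always adjacent,
    # so n - 1 neighbour comparisons replace the n * (n - 1) / 2 pairwise ones.
    for left, right in zip(ordered, ordered[1:]):
        candidate = common_prefix(left, right)
        if len(candidate) > len(longest):
            longest = candidate
    return longest
```

Sorting works here because strings that share a prefix end up next to each other in lexicographic order, so no non-adjacent pair can share a longer prefix than the best adjacent pair.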
Hi CJ, thanks for your feedback. Previously, for each character in the string (O(n) iterations), the code performed a membership check against the entire string (O(n) each), resulting in O(n²) complexity. By precomputing a set of all lowercase characters once (O(n)) and using O(1) average-case set lookups inside the loop, the overall complexity is reduced to O(n).
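A minimal sketch of the precomputed-set idea described above, assuming the function counts the distinct letters that appear in upper case but never in lower case. The function name `count_upper_only_letters` and this exact behaviour are assumptions for illustration, not taken from the PR.

```python
def count_upper_only_letters(s: str) -> int:
    """Count distinct letters that occur in upper case but never in lower case.

    Hypothetical function name and behaviour, used only to show the technique.
    """
    # Precompute once, O(n): every lowercase character present in s.
    lowercase_seen = {ch for ch in s if ch.islower()}

    upper_only = set()
    for ch in s:
        # Set membership is O(1) on average, versus O(n) for `ch.lower() in s`.
        if ch.isupper() and ch.lower() not in lowercase_seen:
            upper_only.add(ch)
    return len(upper_only)
```

The loop still visits every character once, but each membership check no longer rescans the whole string, which is where the O(n²) to O(n) reduction comes from.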
Can you also do a complexity analysis for |
The time complexity of find_longest_common_prefix is now O(n log n + n · m), where n is the number of strings and m is the average length of a string.
How does the program determine the order between two strings? Is the performance affected by string length when we compare two strings? If we take into account |
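Thanks for your feedback! Two strings are ordered by comparing them character by character, so a single comparison can cost up to O(m). Since sorting performs O(n log n) such comparisons, the actual cost of sorting strings is O(n log n · m), not just O(n log n). This dominates the O(n · m) pass over adjacent strings, so the overall time complexity is O(n log n · m).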
Spot on.
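The point about comparison cost can be checked empirically. The sketch below sorts with an explicit character-by-character comparator that counts how many character pairs it inspects; `sort_counting_char_checks` is a hypothetical helper, and the hand-written comparator only approximates what the built-in string comparison does internally.

```python
from functools import cmp_to_key
from typing import List, Tuple


def sort_counting_char_checks(strings: List[str]) -> Tuple[List[str], int]:
    """Sort strings and count the character pairs inspected (hypothetical helper)."""
    checks = 0

    def compare(a: str, b: str) -> int:
        nonlocal checks
        # Walk both strings until the first differing character.
        for x, y in zip(a, b):
            checks += 1
            if x != y:
                return -1 if x < y else 1
        # One string is a prefix of the other: the shorter one sorts first.
        return (len(a) > len(b)) - (len(a) < len(b))

    ordered = sorted(strings, key=cmp_to_key(compare))
    return ordered, checks


if __name__ == "__main__":
    # Strings sharing a long prefix force every comparison to scan ~m characters.
    for m in (10, 100, 1000):
        data = ["a" * m + suffix for suffix in "dcba"]
        _, checks = sort_counting_char_checks(data)
        print(m, checks)
```

With a fixed number of strings, the count grows in proportion to the shared prefix length m, which is the factor of m in O(n log n · m).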
PR summary:
These changes replace the prefix-hash precomputation with sorting and comparing only adjacent strings, which makes the function faster while keeping all existing tests passing, including the large list test.